perm filename DOC[AP,SYS] blob
sn#010733 filedate 1972-08-03 generic text, type T, neo UTF8
USING THE ASSOCIATED PRESS EXTRACTER:
YOU TYPE:               IT ANSWERS:
_____________________________________________________________________
R APE                   a. WAITING FOR GODOT...
                        b. KEYWORDS:
___________________
In this initial step, you have indicated a desire to read
the news from the AP (Associated Press) news line.
a. The program has to wait for another program to let go of a file.
If you wait, it should come back with KEYWORDS:.
b. The program has responded with a request for keywords.
It knows roughly 700 keywords (at present) and most of them are
proper nouns. Your answer to the question KEYWORDS: should
be to type in words that you are interested in reading about,
followed by an altmode (represented below by $).
[expand this section]
_____________________________________________________________________
KEYWORDS:
(VIETNAM+KOREA)*FOO$    a. *** UNRECOGNIZED KEYWORD: FOO ***
                        b. *** MISSING RIGHT PARENTHESIS ***
                        c. *** MISSING KEYWORD ***
                        d. *** SYNTAX ERROR ***
                        e. NO NEWS ITEMS FOUND
                        f. 004 NEWS ITEMS FOUND
                           Read them now?
__________________
a. This means that FOO is not a keyword. If you would like specific words
added to the list of keywords, SEND a note to ME or EMC.
b. A left parenthesis in your expression has no matching right parenthesis.
c. This means the parser was expecting a keyword but found something
else (like an operator or a right parenthesis).
d. The parser thought it was finished, but your input string still had
characters in it (not counting spaces, tabs, and CRLFs).
e. No news about your keywords has come in during approximately the
last 24 hours.
f. Success.
__________________
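The four error messages above map naturally onto a small recursive-descent
parser. The Python sketch below is an illustration only and is not taken from
the APE source. It assumes that "+" means "either keyword" and "*" means
"both keywords" (the text above does not say), that the altmode has already
been stripped, and the names parse_query, QueryError and the stories_for
table are hypothetical.

```python
class QueryError(Exception):
    """Carries one of the four error messages listed above."""


def tokenize(text):
    """Split a query into keywords and the operators + * ( )."""
    tokens, word = [], ""
    for ch in text:
        if ch in "+*()" or ch.isspace():
            if word:
                tokens.append(word)
                word = ""
            if not ch.isspace():
                tokens.append(ch)
        else:
            word += ch
    if word:
        tokens.append(word)
    return tokens


def parse_query(text, stories_for):
    """Return the set of story numbers matching the query.

    stories_for maps each known keyword to a set of story numbers
    (a hypothetical stand-in for the DICT and LINKS files).
    """
    tokens = tokenize(text.upper())
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def primary():
        nonlocal pos
        tok = peek()
        if tok == "(":
            pos += 1
            result = expression()
            if peek() != ")":
                raise QueryError("*** MISSING RIGHT PARENTHESIS ***")
            pos += 1
            return result
        if tok is None or tok in ("+", "*", ")"):
            # Expected a keyword but found an operator, ")" or nothing.
            raise QueryError("*** MISSING KEYWORD ***")
        pos += 1
        if tok not in stories_for:
            raise QueryError(f"*** UNRECOGNIZED KEYWORD: {tok} ***")
        return stories_for[tok]

    def term():
        nonlocal pos
        result = primary()
        while peek() == "*":          # assumed: "*" intersects story sets
            pos += 1
            result = result & primary()
        return result

    def expression():
        nonlocal pos
        result = term()
        while peek() == "+":          # assumed: "+" unites story sets
            pos += 1
            result = result | term()
        return result

    result = expression()
    if pos != len(tokens):
        # The parser finished but input remains.
        raise QueryError("*** SYNTAX ERROR ***")
    return result
```

With a table such as {"VIETNAM": {1, 2}, "KOREA": {2, 3}}, the query
(VIETNAM+KOREA)*FOO raises the UNRECOGNIZED KEYWORD message, and VIETNAM)
raises SYNTAX ERROR because a character is left over after the parse.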
B.(alt)                 004 NEWS ITEMS FOUND
                        Read them now?
Typing only an altmode, with no keywords, lets you review the stories found
for the last keywords you typed.
_______________________________________________________________________
______________________________________________________________________
004 NEWS ITEMS FOUND
Read them now?
3a.Y                    (The four stories will be output to your console.
                        They appear most recent first,
                        and separated by a row of stars (*****). Also,
                        if there are corrections or additions, or if the
                        story has been broken into two smaller stories
                        (takes), all will be output together as one story.)
                        KEYWORDS:
 b.N                    KEYWORDS:
 c.(anything else)      Direct the news where?(Tty,Spooler, and/or File)
______________________________________________________________________
Direct the news where?(Tty,Spooler, and/or File)
4.(any combination of:)
 a. T (or TTY)          (Just like typing "Y" to "Read them now?".)
 b. S (or SPOOLER)      (if only S) @@@@
                        KEYWORDS:
                        (if ST: Just like typing "T".)
                        (if SF: Just like typing "F".)
 c. F (or FILE)         Type filename (the extension .AP will be used):
_____________________
b. If only S is your reply, one "@" will appear for each story found as they
are read and filed. They are filed in a temporary file $NEWS0.AP which is
deleted after it is spooled. If $NEWS0.AP exists, then $NEWS1.AP is tried, etc.
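The search for a free temporary name can be pictured as a simple counting
loop. This is a minimal Python sketch, not the APE code: the function name
next_spool_name is hypothetical, and it assumes the counter simply keeps
increasing (the text does not say whether there is an upper limit).

```python
import os


def next_spool_name(directory):
    """Return the first $NEWSn.AP name not already present in directory."""
    n = 0
    while True:
        name = f"$NEWS{n}.AP"
        if not os.path.exists(os.path.join(directory, name)):
            return name
        n += 1   # assumed: try $NEWS1.AP, $NEWS2.AP, ... without limit
```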
_________________________________________________________________________
Type filename (the extension .AP will be used):
5. FOO                  a. FILE ALREADY EXISTS!
                           Type filename (the extension .AP will be used):
                        b. @@@@
                           KEYWORDS:
__________________________________________________________________________
USING THE ASSOCIATED PRESS HOT LINE:
YOU TYPE:               IT ANSWERS:
RU HOT                  a. WELCOME TO THE AP HOT LINE...
                        b. Cannot contact FILER (hot line supervisor program).
_____________________
Since HOT tries to contact the supervisor program, there may be a
pause before any response is obtained from the computer.
a. You have just caused an interrupt in FILER and will now begin to receive
the news as it comes over the line. The news should
come a buffer at a time, with a pause in between each buffer.
However, it is possible that no news at all is coming over the line,
in which case you will see nothing until some arrives.
b. Something is wrong with the program that reads the news from the AP line.
Sorry, you lose.
HOW THE ASSOCIATED PRESS EXTRACTER WORKS:
Six programs are needed to categorize the news. They are,
with brief descriptions:
FILER: The program that reads the stories from the line, converts them to
ascii, and eventually files them into the NEWS file. It also
updates a pointer in the INDEX file to point to the new story it has
just filed. FILER sends output to the HOT line, and starts DOER and
INITER ptys.
DOER: The program that is started when FILER finishes with a story. It reads
in the new news story and, after alphabetizing the words, searches
for the words in the DICT (dictionary of keywords) and fixes the
appropriate links in the LINKS file.
APE: The user program. It uses the links prepared by DOER in retrieving stories
by keyword.
INITER: The initializing program. It is run the first time the AP News system
comes up, and initializes WORDS, DICT, and LINKS. It also takes input from
the list of sorted keywords WORDS.SRT and puts them in the form of
DICT. If the entries of the dictionary are to be changed then this
program must be run.
HOT: Another user program. This program simply sets a bit in FILER
corresponding to the User job number.
SORT: Sorts the keywords in WORDS.TXT and puts them in WORDS.SRT.
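HOT's one-bit-per-job scheme can be illustrated with an ordinary bit mask.
The Python sketch below is only a model of the idea: the function names are
hypothetical, and it assumes that bit n stands for job number n, which the
description above does not specify.

```python
def subscribe(mask, job_number):
    """Return mask with the bit for job_number set (assumed: bit n = job n)."""
    return mask | (1 << job_number)


def subscribers(mask):
    """List the job numbers whose bits are set, lowest first."""
    jobs = []
    job = 0
    while mask:
        if mask & 1:
            jobs.append(job)
        mask >>= 1
        job += 1
    return jobs
```

FILER could then send each converted buffer to every job returned by
subscribers; setting the same bit twice leaves the mask unchanged.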
DESCRIPTION OF THE FILES USED:
   DICT            LINKS             INDEX                NEWS
 _________      ___________      _____________      ________________
|    |    |    |     |     |    |    |    |   |    |                |
| 1  | 2  |    |  3  |  4  |    | 5  | 6  | 7 |    |                |
|____|____|    |_____|_____|    |____|____|___|    |________________|

 _______________      _______________
|               |    |               |
|     WORDS     |    |     MULTS     |
|_______________|    |_______________|

Pointers (numbers refer to the fields described below):
     1 → WORDS
     2 → MULTS (left half) and LINKS (right half)
     3 → LINKS (both halves)
     4 → LINKS (left half) and INDEX (right half)
     5 → LINKS (left half) and related stories (right half)
     6 → NEWS
DICT: Dictionary of keywords.
1. Left half: holds pointer into WORDS, which holds the actual keywords
Right half: vacant (possible site for Twin keyword link).
2. Left half: holds pointer to MULTS (multiple word keywords, ex: UNITED STATES).
Right half: holds pointer to first occurrence of this word in a story.
LINKS: Holds pointers to all words in the same story and all stories with the same word
3. Left half: holds pointer to the same word in a different story.
Right half: holds back pointer to same word in a different story.
4. Left half: holds the pointer to a different word in the same story.
Right half: holds the pointer to the index for this story.
INDEX: Points to the actual stories in NEWS.
5. Left half: backpointer to the first word in this story.
Right half: holds the pointer to adds, corrections, and multiple
takes for this story.
6. Left half: holds the record number where this story can be found
in NEWS
Right half: holds the displacement of the story from the beginning of
the record.
7. Holds the number of the story as it came over the wire. Used in looking
back to link together multiple takes and adds and corrections.
NEWS: Holds the actual news stories in ascii.
WORDS: Holds the keywords in ascii. Keywords are limited to 20 characters.
MULTS: Not implemented yet, but will hold the second word in multiple word
keys, much like WORDS.
WORDS.TXT: Input file of keywords to be sorted
WORDS.SRT: File of keywords after they are sorted.
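To make the chain from a keyword to its stories concrete, here is a minimal
Python sketch of following fields 2, 3, 4 and 6. It is a model under stated
assumptions, not the system's code: it assumes 18-bit halves of a 36-bit
word, assumes a zero pointer ends a chain, and uses hypothetical names
(pack, story_locations) and in-memory dictionaries in place of the files.

```python
HALF_BITS = 18                       # assumed half-word width
HALF_MASK = (1 << HALF_BITS) - 1


def pack(left, right):
    """Combine two half-word values into one word."""
    if not (0 <= left <= HALF_MASK and 0 <= right <= HALF_MASK):
        raise ValueError("half-word out of range")
    return (left << HALF_BITS) | right


def left_half(word):
    return word >> HALF_BITS


def right_half(word):
    return word & HALF_MASK


def story_locations(dict_entry, links, index):
    """Follow one keyword's chain; return (record, displacement) pairs.

    dict_entry is (field 1, field 2); links maps a pointer to
    (field 3, field 4); index maps a pointer to (field 5, field 6, field 7).
    """
    locations = []
    ptr = right_half(dict_entry[1])           # field 2, right half
    while ptr != 0:                           # assumed: 0 ends the chain
        same_word, same_story = links[ptr]    # fields 3 and 4
        index_ptr = right_half(same_story)    # field 4, right half
        where = index[index_ptr][1]           # field 6
        locations.append((left_half(where), right_half(where)))
        ptr = left_half(same_word)            # field 3, left half
    return locations
```

Each pair returned names a record in NEWS and the displacement of the story
within that record, which is all a reader needs to fetch the text.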
FILE USAGE FOR THE AP NEWS FILING, CATALOGING AND RETRIEVING SYSTEM.
------------------------------------------------------------------------------
SOURCE FILES    DESCRIPTIONS
FILER           PROGRAM TO READ FROM AP LINE AND FILE STORIES.
DOER            PROGRAM TO CATALOG STORIES.
APE             PROGRAM TO RETRIEVE STORIES CONTAINING GIVEN KEYWORDS.
INITER.SAI      PROGRAM TO INITIALIZE THE FOLLOWING FILES: WORDS, DICT, LINKS.
SORT            PROGRAM TO SORT THE FILE WORDS.TXT INTO THE FILE WORDS.SRT.
HOT             PROGRAM TO OUTPUT AP NEWS AS IT COMES IN OVER THE AP LINE.
------------------------------------------------------------------------------
DATA FILES      DESCRIPTIONS
NEWS            AP NEWS STORIES.
INDEX           INDEX INFORMATION INTO NEWS FILE.
LINKS           LINKS CONNECTING ALL WORDS IN SAME STORY AND ALL STORIES WITH SAME WORD.
DICT            POINTERS TO WORDS FILE AND LINKS FILE INDICATING CATALOGING OF STORIES.
RELATS          POINTERS TO: TWIN, SON, BROTHER.
MULTS           MULTIPLE WORD KEYWORDS.
WORDS           CHARACTERS IN THE KEYWORDS IN THE DICTIONARY.
WORDS.TXT       TEXT FILE USED TO TYPE IN DICTIONARY FOR INITIALIZATION.
WORDS.SRT       SORTED VERSION OF WORDS.TXT
FILER:
1. INITIALIZATION: Most of this part of the program is done only
the first time the NEWS program comes up. It assures
the program of having the necessary files to work. If no NEWS file
exists, NEWS and INDEX files are created. INITER is also run from
this part of the program. If DOER isn't running, it is started
on a pty after INITER is finished.
2. SEARCHING FOR THE BEGINNING OF A STORY: FILER waits for characters
to appear in the AP buffer. When some do, it converts them to ascii
and then sends a letter containing the news to all of the
jobs running the HOT line. It then checks the buffer and sees if
it contains the beginning or the middle of a story.
If the latter is true, it throws out the characters and waits for some more.
If the beginning of a story (marked by "A digit digit digit"LF),
is in the buffer, all the garbage before the story is thrown out and we
proceed to 3.
3. SEARCHING FOR THE END OF A STORY: Here FILER does the same character
grabbing from the buffer that it did before except that it now looks
for the three LF's signalling the end of a story, and it deposits
and sends to the HOT line all the characters until it has found the end.
4. PREPARE TO WRITE OUT THE NEWLY READ STORY: News stories are not allowed
to end exactly on a record boundary, so if this one does, we add an extra word
of null bytes to the end of it. The INDEX file is opened and the special pointers
to the oldest story (OLD) and the place for the new story (NEW) are read in.
The length of the newly read story is added to the end of the last story
in NEWS, and if there will be some problem in fitting in
the story (if it will overlap past OLD, or if it won't fit into
the bottom of the file without wrapping around) fixups are undertaken. If
an old story must be deleted, all the words associated with that story
are returned to the available list in LINKS and pointers are cleared in
DICT. OLD is then updated.
5. WRITE OUT NEWS STORY: Now the NEWS file is opened for updating. The correct
record is read in and then written out again with the new NEWS story.
At this point a letter is sent to DOER, telling it to go to work if it isn't already.
6. Before we return to 2., we must move the last record of the last news story
up to the top of the story buffer in preparation for writing it out the next
time.
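Steps 2 and 3 amount to framing stories in a stream of buffers. The Python
sketch below illustrates that framing only; it is not FILER's code. It reads
the start marker as the letter A followed by three digits and a line feed,
and the end marker as three consecutive line feeds; both readings are
assumptions, and the name split_stories is hypothetical. Unlike the
description above, it keeps the last few characters of a discarded buffer so
that a marker split across two buffers is still found.

```python
import re

START = re.compile(r"A(\d{3})\n")   # assumed reading of the start marker
END = "\n\n\n"                      # assumed: three consecutive line feeds


def split_stories(chunks):
    """Yield (story_number, text) for each complete story in the chunks."""
    pending = ""
    in_story = False
    number = None
    for chunk in chunks:
        pending += chunk
        while True:
            if not in_story:
                match = START.search(pending)
                if match is None:
                    # Middle of a story or garbage: discard it, keeping
                    # only a tail that could begin a start marker.
                    pending = pending[-4:]
                    break
                number = int(match.group(1))
                pending = pending[match.end():]
                in_story = True
            end = pending.find(END)
            if end < 0:
                break                # wait for more characters
            yield number, pending[:end]
            pending = pending[end + len(END):]
            in_story = False
```

Feeding it the buffers one at a time mirrors the wait-and-check loop: nothing
is produced until both the beginning and the end of a story have been seen.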
DOER:
1. WAIT FOR MAIL: DOER is started with a letter from FILER, telling
it that there is another story to categorize.
2. READ AN UNCATALOGED STORY FROM THE NEWS FILE: DOER first reads
in the INDEX file and grabs the special pointers NEW (pointer for next
incoming story in INDEX) and UNDUN (pointer into INDEX of first uncataloged
story). If UNDUN has caught up with NEW, DOER goes back to waiting for mail.
Otherwise, it reads in the uncataloged story from the NEWS file.
3. CHECK FOR ADDS AND TAKES, AND ALPHABETIZE WORDS IN EACH STORY: The news
is initially put in a buffer called STORY, but is soon moved, text word
by text word, into another buffer called TEXT. As the first words are being moved
from STORY, they are also being checked for the special word "TAKE" and for
the number of another story (meaning that it is an add or correction).
If one of these occurs then the appropriate link is set in INDEX and we return
to the previous step. Otherwise, an array is built called
SORDID, which contains pointers into TEXT and links used to alphabetize the
words. When the last word is moved from STORY, all the words have been
alphabetized in SORDID.
4. LOOK FOR KEYWORDS IN STORY: Now the DICT file is read in and the
SORDID list is compared to it for duplications. If one is found, an entry
is made into LINKS and the pointer is set in the DICT file.
5. UPDATE INDEX FILE: The index file is read in and the UNDUN pointer is
advanced. If UNDUN ≠ NEW, DOER goes back to step 2. Otherwise it returns
to step 1.
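Steps 3 and 4 work because both lists are in alphabetical order, so one pass
over each is enough to find the matches. The Python sketch below illustrates
that merge only. It is not DOER's code: it uses Python's built-in sort in
place of the linked SORDID array, treats a word as a run of letters (an
assumption), and the name keywords_in_story is hypothetical.

```python
import re


def keywords_in_story(text, dictionary):
    """Return the dictionary keywords that occur in text, in sorted order.

    dictionary must already be sorted, as WORDS.SRT is.
    """
    # Stand-in for SORDID: the story's distinct words, alphabetized.
    words = sorted(set(re.findall(r"[A-Z]+", text.upper())))
    found = []
    i = j = 0
    while i < len(words) and j < len(dictionary):
        if words[i] == dictionary[j]:
            found.append(words[i])    # DOER would make a LINKS entry here
            i += 1
            j += 1
        elif words[i] < dictionary[j]:
            i += 1                    # story word is not a keyword
        else:
            j += 1                    # keyword does not occur in the story
    return found
```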